RVF Data Processing Pipeline - Main Script
Overview
The targets contained in the predictor_data_processing_targets.R script integrate environmental data from the Africa Environmental Data Pipeline with Rift Valley fever (RVF) outbreak records from the WAHIS (World Animal Health Information System) database to create a unified dataset for epidemiological modeling.
Pipeline Components
1. Data Import
- Continental Boundaries: Creates Africa polygon and raster template (0.1° resolution)
- Base Predictors: Downloads processed environmental data from AWS
- RVF Outbreaks: Retrieves RVF outbreak records from WAHIS
- Response Variable: Generates RVF response data based on outbreak records
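As a rough illustration of how outbreak records might be turned into a gridded response variable, the sketch below snaps point locations to 0.1° cells and flags cells with at least one outbreak. It uses base R only. The grid origin, column names, and example coordinates are assumptions for illustration, not values taken from the pipeline, which may well rely on a raster package instead.

```r
# Hypothetical lower-left corner of a grid covering Africa (assumption).
xmin <- -26
ymin <- -35
res <- 0.1  # template resolution in degrees, as stated in the list above

# Toy outbreak records; real records would come from WAHIS.
outbreaks <- data.frame(
  lon = c(36.85, 36.87, 28.23),
  lat = c(-1.25, -1.27, -25.74)
)

# Index of the 0.1-degree cell containing each point.
outbreaks$cell_col <- floor((outbreaks$lon - xmin) / res)
outbreaks$cell_row <- floor((outbreaks$lat - ymin) / res)
outbreaks$n_outbreaks <- 1

# One row per cell with an outbreak count and a binary response.
response <- aggregate(n_outbreaks ~ cell_col + cell_row, data = outbreaks, FUN = sum)
response$rvf_present <- as.integer(response$n_outbreaks > 0)
```

The first two toy records fall in the same cell, so the result has one row per distinct cell rather than one row per record.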
4. Data Integration
- Joins RVF response data with environmental predictors
- Creates unified dataset for modeling
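The join step can be sketched in base R as a left join from predictors to response. This is a minimal example under stated assumptions: the key columns (`cell_id`, `date`), the `rainfall` predictor, and the choice to treat unmatched cells as absences are all hypothetical and may differ from what the pipeline actually does.

```r
# Toy pixel-level predictors keyed by cell id and date (all names hypothetical).
predictors <- data.frame(
  cell_id = c(1, 2, 3),
  date = as.Date("2020-01-01"),
  rainfall = c(12.5, 0.0, 48.1)
)

# Toy response: only cells with a recorded outbreak appear.
response <- data.frame(
  cell_id = 2,
  date = as.Date("2020-01-01"),
  rvf_present = 1L
)

# Left join: keep every predictor row, attach the response where it exists.
model_data <- merge(predictors, response, by = c("cell_id", "date"), all.x = TRUE)

# Cells with no matching outbreak record are treated as absences (assumption).
model_data$rvf_present[is.na(model_data$rvf_present)] <- 0L
```

Using `all.x = TRUE` keeps cells without outbreaks in the dataset, which matters if the model needs both presences and absences.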
5. Data Aggregation
(In progress: lagged predictors are not yet implemented)
- Imports administrative boundaries (e.g., South Africa districts)
- Aggregates pixel-level data to administrative units
- Outputs district-level datasets for analysis
- Lags important predictors
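The aggregation and lagging steps could look roughly like the base R sketch below. It assumes each pixel already carries a district label and a month index; in practice that label would come from overlaying pixels on the administrative boundaries. The column names, the mean as summary statistic, and the one-month lag are illustrative assumptions only.

```r
# Toy pixel-level data: two districts, three months (names hypothetical).
pixels <- data.frame(
  district = rep(c("A", "B"), each = 6),
  month = rep(rep(1:3, each = 2), times = 2),
  rainfall = c(10, 20, 30, 40, 50, 60, 5, 15, 25, 35, 45, 55)
)

# Aggregate pixels to district-month means.
district_data <- aggregate(rainfall ~ district + month, data = pixels, FUN = mean)
district_data <- district_data[order(district_data$district, district_data$month), ]

# Lag rainfall by one month within each district; the first month has no lag value.
lag1 <- function(x) c(NA, head(x, -1))
district_data$rainfall_lag1 <- ave(
  district_data$rainfall,
  district_data$district,
  FUN = lag1
)
```

Sorting by district and month before lagging is what makes the shifted values line up with the previous month of the same district.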
Key Features
- Multi-project Architecture: Uses targets projects to separate data acquisition from analysis
- CAPSULE Support: Manages R package dependencies for reproducibility
- AWS Integration: Syncs data with cloud storage
- Modular Design: Separate sections for different processing stages
- Error Handling: Continues execution even if some components fail
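One way the error-handling behavior might be configured in a targets script is shown below. This is a sketch, not the pipeline's actual code: the target names and the functions `get_outbreaks()`, `get_predictors()`, and `join_data()` are hypothetical placeholders, and the pipeline may achieve the same effect by other means.

```r
library(targets)

# Keep building independent targets when one of them errors.
tar_option_set(error = "continue")

# Hypothetical target list; the function names are placeholders and are
# not evaluated until the pipeline is run.
list(
  tar_target(rvf_outbreaks, get_outbreaks()),
  tar_target(base_predictors, get_predictors()),
  tar_target(model_data, join_data(rvf_outbreaks, base_predictors))
)
```

For the multi-project setup, targets can switch between projects through the `TAR_PROJECT` environment variable, for example `Sys.setenv(TAR_PROJECT = "...")` with a project name defined in the repository's configuration.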
Output
The pipeline produces district-level datasets ready for epidemiological modeling by:
- Fetching environmental predictors
- Adding RVF outbreak information
- Aggregating data to administrative boundaries
- Lagging important predictor variables